MonolayerCultures / src / AllCells / [RC17]Differential_Gene_Expression(AllCells).Rmd
[RC17]Differential_Gene_Expression(AllCells).Rmd
Raw
---
title: '[RC17]Differential_Gene_Expression(AllCells)'
author: "Nina-Lydia Kazakou"
date: "06/03/2022"
output: html_document
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

# Set-up

### Load libraries

```{r message=FALSE, warning=FALSE}
library(SingleCellExperiment)
library(Seurat)
library(tidyverse)
library(Matrix)
library(scales)
library(cowplot)
library(RCurl)
library(MAST)
library(ggplot2)
library(edgeR)
library(dplyr)
library(ggsci)  
library(here)
library(viridis)
```

### Colour Palette

```{r load_palette}
mypal <- pal_npg("nrc", alpha = 0.7)(10)
mypal2 <-pal_tron("legacy", alpha = 0.7)(7)
mypal3<-pal_lancet("lanonc", alpha = 0.7)(9)
mypal4<-pal_simpsons(palette = c("springfield"), alpha = 0.7)(16)
mypal5 <- pal_rickandmorty(palette = c("schwifty"), alpha = 0.7)(6)
mypal6 <- pal_futurama(palette = c("planetexpress"), alpha = 0.7)(5)
mypal7 <- pal_startrek(palette = c("uniform"), alpha = 0.7)(5)
mycoloursP<- c(mypal, mypal2, mypal3, mypal4)
```

### Load object

```{r}
RC17_norm <- readRDS(here("data", "Processed", "AllCells", "RC17_AllCells_srt.rds"))
```

# Differential Gene Expression Analysis 
I will perform differential gene expression testing within Seurat. Seurat's default is a Wilcoxon rank sum test. However, based on previous assessments of the differential gene expression methods within our lab, as well as literature (ref), we decided that the "MAST" method fits data better. 
MAST is a GLM-framework that treates cellular detection rate as a covariate that can also work within Seurat, but the MAST package should be loaded. 
MAST uses a two-part, generalized linear model for bimodal data that parameterizes both strongly non-zero or non-detectable features; the fraction of genes expressed in a cell, should be adjusted for as a source of nuisance variation. 
More details about MAST can be found here: https://genomebiology.biomedcentral.com/articles/10.1186/s13059-015-0844-5

```{r}
RC17_markers <- FindAllMarkers(RC17_norm, only.pos = TRUE, min.pct = 0.25, logfc.threshold = 0.25, term.use = "MAST")

write.csv(RC17_markers, here("outs", "Processed", "AllCells", "AllMarkers_AllCells.csv"))
```

## Visualise the top differentially expressed genes
```{r fig.height=15, fig.width=15}
top5 <- RC17_markers %>% group_by(cluster) %>% top_n(n = 5, wt = avg_log2FC)

cluster_averages <- AverageExpression(RC17_norm, group.by = "ClusterID",
                                      return.seurat = TRUE)

cluster_averages@meta.data$cluster <- factor(levels(RC17_norm@meta.data$ClusterID))

saveRDS(cluster_averages, here("data", "Processed", "AllCells", "RC17_AllCells_AvgClu.rds"))

DoHeatmap(object = cluster_averages, features = top5$gene, label = TRUE, group.by = "cluster",  group.colors = mycoloursP[6:40], draw.lines = F, size = 4) + NoLegend() + theme(text = element_text(size = 10, face = "bold")) + scale_fill_viridis(discrete=FALSE) 
```

```{r}
sessionInfo()
```